51 research outputs found
A statistically principled approach to histogram segmentation
This paper outlines a statistically principled approach to clustering one-dimensional data. Given a dataset, the idea is to fit a density function that is as simple as possible, but still compatible with the data. Simplicity is measured in terms of a standard smoothness functional. Data compatibility is given a precise meaning in terms of distribution-free statistics based on the empirical distribution function. The main advantages of this approach are that (i) it involves a single decision parameter which has a clear statistical interpretation, and (ii) there is no need to make a priori assumptions about the number or shape of the clusters
Propagating uncertainty in tree-based load forecasts
This paper discusses the use of ensembles of regression trees as a straightforward but versatile methodology to generate short-term (day-ahead) load forecasts for real data from the Global Energy Forecasting Competition 2014. Since temperature is a strong predictor of load, we investigate how forecast uncertainty in temperature can affect the performance of the prediction model. To this end, an approach based on the singular value decomposition (SVD) is used to simulate noisy but realistic temperature profiles. Our results show that as long as uncertainty is not exceedingly large, it is worthwhile to include temperature forecasts as predictors
Self-stabilized fast gossiping algorithms
In this article, we explore the topic of extending aggregate computation in distributed networks with self-stabilizing properties to withstand network dynamics. Existing research suggests that fast gossiping algorithms, based on the properties of order statistics applied to families of exponential random variables, are a viable solution for computing functions of the values stored in the network. We focus on the specific case in which network changes and failures occur in batches with a minimum frequency on the order of the diameter of the network. Our contribution consists of two self-stabilizing mechanisms that make fast gossiping algorithms applicable to dynamic networks with a minor increase in resource usage. The resulting algorithms can be deployed in networks exhibiting churn, node stop-failures and resets, and random topological changes. The theoretical results are verified with simulations on synthetic data, showcasing desirable properties for large-scale network designers such as scalability, lack of single points of failure, and anonymity
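The following is a minimal sketch of the general idea behind gossiping with exponential random variables, not the self-stabilizing mechanisms proposed in the article. It assumes a static ring network, positive node values, and synchronous rounds; the function names (`exponential_sketch`, `gossip_min`, `estimate_sum`, `ring`) are hypothetical. Each node draws k exponential samples whose rate equals its value, the nodes repeatedly exchange coordinate-wise minima, and the sum of all values is then estimated from the minima, which are exponentially distributed with rate equal to that sum.

```python
import random


def exponential_sketch(value, k, rng):
    """Draw k independent Exp(rate=value) samples for one node (value > 0)."""
    return [rng.expovariate(value) for _ in range(k)]


def gossip_min(sketches, neighbours, rounds):
    """Synchronous min-gossip: in every round each node keeps the
    coordinate-wise minimum over its own sketch and its neighbours' sketches."""
    current = [list(s) for s in sketches]
    for _ in range(rounds):
        updated = []
        for node, own in enumerate(current):
            merged = list(own)
            for other in neighbours[node]:
                merged = [min(a, b) for a, b in zip(merged, current[other])]
            updated.append(merged)
        current = updated
    return current


def estimate_sum(sketch):
    """The minima are Exp(rate = sum of values), so (k - 1) / sum(minima)
    is an unbiased estimate of the sum of all node values."""
    k = len(sketch)
    return (k - 1) / sum(sketch)


def ring(n):
    """Hypothetical topology: node i is linked to nodes i - 1 and i + 1 (mod n)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
k = 2000  # more samples per node give a tighter estimate

sketches = [exponential_sketch(v, k, rng) for v in values]
# A ring of n nodes has diameter n // 2, so that many rounds spread every minimum.
final = gossip_min(sketches, ring(len(values)), rounds=len(values) // 2)
estimates = [estimate_sum(s) for s in final]
print(estimates[0], sum(values))
```

Because only minima are exchanged, no node needs to know who contributed which value, and every node ends up with the same estimate. The relative error shrinks roughly as one over the square root of k.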
Quantifying volatility reduction in German day-ahead spot market in the period 2006 through 2016
In Europe, Germany is taking the lead in the switch from conventional to renewable energy.
This poses new challenges as wind and solar energy are fundamentally intermittent, weather-dependent and less predictable.
It is therefore of considerable interest to investigate the evolution of price volatility in this post-transition era.
There are a number of reasons, however, that make practical studies difficult.
For instance, EPEX prices can be zero or negative.
Consequently, the standard approach in financial time series analysis to switch to logarithmic measures is inapplicable.
Furthermore, in contrast to stock market prices, which are only available for trading days, EPEX prices cover the whole year, including weekends and holidays.
Accordingly, there is a lot of underlying variability in the data which has nothing to do with volatility, but simply reflects diurnal activity patterns.
An important distinction of the present work is the application of matrix decomposition techniques, namely the singular value decomposition (SVD), for defining an alternative notion of volatility.
This approach is systematically more robust to outliers and to the diurnal patterns.
Our observations show that the day-ahead market is becoming less volatile in recent years
Data-driven pattern identification and outlier detection in time series
We address the problem of data-driven pattern identification and outlier detection in time series. To this end, we use singular value decomposition (SVD), which is a well-known technique to compute a low-rank approximation for an arbitrary matrix. By recasting the time series as a matrix it becomes possible to use SVD to highlight the underlying patterns and periodicities. This is done without the need for specifying user-defined parameters. From a data mining perspective, this opens up new ways of analyzing time series in a data-driven, bottom-up fashion. However, in order to get correct results, it is important to understand how the SVD-spectrum of a time series is influenced by various characteristics of the underlying signal and noise. In this paper, we extend earlier work by initiating a more systematic analysis of these effects. We then illustrate our findings on some real-life data
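As a rough illustration of recasting a time series as a matrix and applying SVD, the sketch below folds a synthetic hourly series into one row per day, takes a rank-1 approximation, and flags the entry with the largest residual. It is not the paper's procedure: the period of 24, the rank of 1, the synthetic signal, and the helper names `series_to_matrix` and `low_rank_residuals` are all assumptions made for the example.

```python
import numpy as np


def series_to_matrix(series, period):
    """Fold a 1-D series into a matrix with one row per period (e.g. per day)."""
    n_rows = len(series) // period
    trimmed = np.asarray(series[: n_rows * period], dtype=float)
    return trimmed.reshape(n_rows, period)


def low_rank_residuals(matrix, rank):
    """Return the singular values and the residual after a rank-`rank` SVD fit."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return s, matrix - approx


# Synthetic data (assumption): a repeating daily profile plus small noise.
rng = np.random.default_rng(0)
period, n_days = 24, 60
hours = np.arange(period)
daily_profile = 10 + 5 * np.sin(2 * np.pi * hours / period)
series = np.tile(daily_profile, n_days) + rng.normal(0.0, 0.2, period * n_days)
series[30 * period + 7] += 8.0  # inject one outlier at day 30, hour 7

matrix = series_to_matrix(series, period)
singular_values, residuals = low_rank_residuals(matrix, rank=1)

# The dominant singular value captures the shared daily pattern;
# the largest residual points at the injected outlier.
day, hour = np.unravel_index(np.argmax(np.abs(residuals)), residuals.shape)
print(singular_values[:3], int(day), int(hour))
```

In this toy setting the first singular value dwarfs the rest because every day follows the same profile, which is the kind of spectrum behaviour the abstract says must be understood before interpreting results on real data.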
One Class Classification for Anomaly Detection: Support Vector Data Description Revisited
The Support Vector Data Description (SVDD) has been
introduced to address the problem of anomaly (or outlier) detection.
It essentially fits the smallest possible sphere around the given
data points, allowing some points to be excluded as outliers.
Whether or not a point is excluded is governed by a slack variable.
Mathematically, the values for the slack variables are obtained by
minimizing a cost function that balances the size of the sphere
against the penalty associated with outliers. In this paper we argue
that the SVDD slack variables lack a clear geometric meaning, and we
therefore re-analyze the cost function to get a
better insight into the characteristics of the solution. We also introduce
and analyze two new definitions of slack variables and show that
one of the proposed methods behaves more robustly with
respect to outliers, thus providing tighter bounds compared to SVDD
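For orientation, the cost function referred to above is usually written in the following standard form. The symbols are assumptions, since the abstract introduces none: $a$ is the sphere centre, $R$ its radius, $\xi_i$ the slack variable of data point $x_i$, and $C$ the trade-off parameter. The two alternative slack definitions proposed in the paper are not shown here.

```latex
\min_{R,\, a,\, \xi} \quad R^2 + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
\lVert x_i - a \rVert^2 \le R^2 + \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n
```

A large $C$ penalizes excluded points heavily and pushes the sphere to enclose nearly all data, while a small $C$ allows a tighter sphere with more points treated as outliers.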
Enabling Future Smart Energy Systems
The on-going transition to more sustainable energy production methods means that we are moving away from a monolithic, centrally controlled model to one in which both production and consumption are progressively decentralised and localised. This in turn gives rise to complex interacting networks. ICT and mathematics will be instrumental in making these networks more efficient and resilient. This article highlights two research areas that we expect will play an important role in these developments
Indexing, learning and content-based retrieval for special purpose image databases
This chapter deals with content-based image retrieval in special purpose image databases. As image data is amassed ever more effortlessly, building efficient systems for searching and browsing of image databases becomes increasingly urgent. We provide an overview of the current state-of-the art by taking a tour along the entir
The influence of the switch from fossil fuels to solar and wind energy on the electricity prices in Germany
Germany is actively pursuing a switch from fossil fuel to renewables, the so-called
Energiewende (energy transition). Because the supply of wind and solar energy is
less predictable than the supply of fossil fuel, stabilizing the grid has become more
challenging. On sunny and windy days the supply in Germany substantially exceeds demand,
and the surplus needs to be exported to the neighboring countries. In this study we analyze
data from the German day-ahead market in the period 2009 through 2015 and show that
the realized day-ahead price experiences significant downward pressure from high
predictions for the day-ahead solar and wind supply. This conclusion is based on a regression
analysis using the singular value decomposition (SVD) method. SVD decomposes the time
series as a sum of data-determined profiles.
During the observed period the market share of solar and wind energy in the total energy
supply increased in Germany. The larger the market share, the more impact solar and wind
energy have
- …